IDEAS

COBOL Maintenance Technologies is a group of COBOL programmers dedicated to developing and maintaining high quality software. Our quest to find new ways of improving software maintainability has provided some useful observations which we'd like to share with you.

Why is source code so important?

================================================================================
Applications software is a repository of accumulated business knowledge. The source code is the business specification.
================================================================================

There is a strong correlation between the quality and efficiency of software management, and our ability to manipulate an application's source code. Significant short and long term benefits are available through:

1. Improving the programmer-to-source code interface by using repository based system analysis and documentation software.

2. Increasing the quality and maintainability of our applications software by reducing redundancies, and isolating system and business functions at the source code level.

Quality is not a one-time activity

================================================================================
Nothing affects system maintainability more than the line-by-line evolution of the source code. The constant and never-ending refinement of system and business functions is the only sure way to attain and maintain software quality.
================================================================================

System-wide Analysis and Documentation Basics

The most important feature of system analysis and documentation software is that it provides system-wide analysis capability. This requires the ability to load application source code into a repository that can be accessed by a flexible interactive viewing facility and by extensive reporting capabilities.
As source code is loaded into the repository, it is separated into component categories--typically data-names, paragraphs, and members--and the interrelationships between these components are captured.

Once loaded into the repository, each system component provides insight into the quality and consistency of system functions. This information is especially useful for impact analysis, system understanding, quality assurance, and provides a solid base for re-engineering efforts.

This document presents examples of system information that is available from the repository, and shows how this information can be exploited to build quality into existing COBOL systems.

Table of Contents

Data Information
   File-to-Program Interface ................................  4
   Program-to-Program Interface .............................  4
   Data Items That Control Processing Routines ..............  5
   Data Items That Store Application Data ...................  5
   Data Items That Share Data ...............................  6
   Similar Data-names .......................................  6

Paragraph Information
   Storing Functions in Paragraphs ..........................  7
   Procedure Copybook Paragraph Usage .......................  7
   Data Items Referenced in a Paragraph .....................  8
   Similar Paragraph-names ..................................  8

Standardize the Data
Modularize the Processes

================================================================================
All reports mentioned in IDEAS are available with GPSA. In the bad old days, before we built GPSA, virtually all these reports were produced manually by our maintenance programmers using a search utility and a mainframe style program editor. Working on software maintenance without GPSA is okay if you are being paid by the hour and have lots of time, but not if results are important to you and your organization.
================================================================================

Table of Contents

Member Information
   Member Descriptions ......................................  9
   Program Usage ............................................  9
   Copybook Usage ........................................... 10
   System Analysis by Data Type ............................. 10
   Data Items Referenced Within Member ...................... 11
   Storing Functions in Members ............................. 11

Additional Topics
   Linking External to Internal System Elements ............. 12
   Locate Selected Verbs .................................... 13
   Alphanumeric Literals in the Procedure Division .......... 13

Flexibility + Maintainability = Quality

Data Information ( 4 )

File-to-Program Interface

An essential element of any file-to-program interface is consistency. All programs that access a file must use the same file description or record definition. This is accomplished by having one record definition, placed in a copybook, for each file in the system.

When a record definition is updated, it is easy to locate all programs that are affected by the change; simply find all references to the copybook containing the updated record definition on the Copy Cross-Reference report. Each program that contains a reference must be recompiled.

How to find record definitions that are not in copybooks:

1. The Hard-Coded Record Definitions report lists all record definitions that are not located in copybooks.

2. The Data Areas Referenced by File Access Verbs report lists all data definitions referenced by file access verbs and their parameters (i.e. KEY IS, INTO, and FROM). Use this report to locate all data definitions that are not in copybooks.

Program-to-Program Interface

An essential element of any program-to-program interface is consistency. This interface, or linkage, usually consists of one or more data areas.
All data definitions that are used in the program-to-program interface (i.e. CALL USING DATA-AREA and PROCEDURE DIVISION USING DATA-AREA) must be located in copybooks.

Consider the situation where a change is required to the program-to-program interface, but the linkage area is coded in several programs: it would be more difficult to correctly change several linkage areas than a single linkage area located in a copybook. Problems which occur as a result of inconsistent linkage areas are extremely difficult, time-consuming, and costly to correct.

When copybook linkage areas are updated, it is very easy to locate all programs that are affected by the change; simply find all references to the updated copybook on the Copy Cross-Reference report. Each program that contains a reference must be recompiled.

How to find linkage data definitions that are not in copybooks:

1. The Hard-Coded Linkage Sections report lists all data areas that are located in the LINKAGE SECTION of a CALLed program.

2. The Call Cross-Reference report lists all CALL statements along with their parameters (e.g. USING DATA-01 DATA-02). Starting with the data areas listed on this report, and viewing the All Data-names screen which shows where each data area is located, it is easy to identify all data definitions that are not located in copybooks.

Data Information ( 5 )

Data Items That Control Processing Routines

Many data items are used to control the execution of processing routines. These data items may be dates, indicators, transaction codes, or combinations of the various types.

A large proportion of system bugs are the result of inconsistent operations being performed on these data items. For example, inconsistent or invalid data values are moved into them, or there are several different methods of testing their values.
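One way to surface this kind of inconsistency is to list every value moved into a control item next to every value it is tested against. The Python sketch below is an illustration only and does not describe how GPSA works. The function name control_item_usage and the sample lines are hypothetical, and the sketch assumes one statement per source line and literals that contain no spaces.

```python
import re

def control_item_usage(source_lines, data_name):
    """Return (values moved into data_name, values data_name is tested against)."""
    name = re.escape(data_name)
    # MOVE <value> TO <data_name>; the lookahead rejects longer names
    # such as EOF-FLAG-2 when searching for EOF-FLAG.
    move_re = re.compile(
        r"\bMOVE\s+(\S+)\s+TO\s+" + name + r"(?![\w-])", re.IGNORECASE)
    # IF <data_name> = <value>, EQUAL [TO] <value>, or NOT = <value>.
    test_re = re.compile(
        r"\bIF\s+" + name + r"\s+(?:=|EQUAL(?:\s+TO)?|NOT\s*=)\s*(\S+)",
        re.IGNORECASE)
    moved, tested = set(), set()
    for line in source_lines:
        m = move_re.search(line)
        if m:
            moved.add(m.group(1).rstrip("."))
        t = test_re.search(line)
        if t:
            tested.add(t.group(1).rstrip("."))
    return moved, tested

# Hypothetical procedure code, one statement per line.
sample = [
    "    MOVE 'Y' TO EOF-FLAG.",
    "    MOVE 'YES' TO EOF-FLAG.",
    "    IF EOF-FLAG = 'Y'",
    "    IF EOF-FLAG EQUAL TO 1",
    "    MOVE 'N' TO EOF-FLAG-2.",
]
moved, tested = control_item_usage(sample, "EOF-FLAG")
print("moved in:  ", sorted(moved))
print("tested for:", sorted(tested))
```

In this sample, 'YES' is moved into EOF-FLAG but never tested, and the item is tested against 1 although 1 is never moved in. Both are the sort of mismatch worth reviewing.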
On the bright side, these data items provide a unique and highly focused look at the system elements responsible for executing a specific process--even though the elements are located in routines and programs that appear to be completely unrelated.

The References to Data-Name report lists all procedure code that references one or more selected data-names. This is an important quality assurance report for maintenance projects and for the latter stages of system development, when processing routines produce unexpected results due to data items which contain incorrect values--and no one on the project remembers all the places where the data items are updated.

The Associated Data-Names report lists the data definition of each data item that is associated with one or more selected data items. Associated data items are data items that are moved into, receive data from, or are compared with, another data item. Using this report, we can view the initial values of the associated data items, and check that all definitions are consistent (e.g. all associated data items are defined as five digit integers).

Data Items That Store Application Data

Data items that store application data are typically used by application specific processing routines and, with the exception of temporary data areas, should be located in copybooks.

Business analysts and system re-developers are keenly interested in the information provided by application data. Business rules, proprietary calculations, and application specific processing routines are all attached to application data.

There are two reports which can be used to analyze application data: References to Data-Name which reports on up to 30 individually selected data items, and References to All Data-Names in Member which reports on all data items in a copybook or program. Both reports list the procedure code that references the selected data items.
Report options make it possible to list references to each individual data item, or to all data items as a group.

Data Information ( 6 )

Data Items That Share Data

Data items that share data provide us with a way to link or track data across programs, files, and applications. For example, a payment amount is divided among several categories of revenue accumulators, and added to year-to-date totals. The payment amount, revenue accumulators, and year-to-date totals provide us with a list of data items that share the payment amount.

Data items that share data can be especially difficult to balance when the data items are updated by several programs, each with its own unique calculations.

The References to Data-Name report is used to list the data definitions and procedure code references for the list of data items that share the payment amount. Analysis of data items that share data frequently leads to the discovery of inconsistent or redundant processing routines and calculations.

Similar Data-names

Analyzing similar data-names can help us in two ways:

1. Duplicate data-names often occur when a programmer copies a group of data definitions from one program into another, resulting in duplicate data definitions. This is a problem when a data area changes and the same change must be applied to several programs. The recommended solution is to move one of the data areas to a copybook, and replace each redundant data area with the new copybook.

2. In most situations similar data-names (e.g. DATE, DTE, YYMMDD) should be defined in a standard format (e.g. an eight digit integer). This is especially important with the year 2000 approaching.

Similar data-names can be located on the All Data-Names screen, or the All Data-Names report which has the additional facility of matching data-names against a specified pattern or string (e.g. all data-names that contain the string YYMM). Use the =DUPS filter option to produce a list of all duplicate data-names.
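The idea behind a duplicate data-name list and a string match can be sketched in a few lines of Python. This is a rough illustration under our own assumptions, not GPSA's implementation: the function names and sample members are hypothetical, and the source lines are assumed to have the sequence-number area already removed so that each definition starts with its level number.

```python
import re
from collections import defaultdict

# A two-digit level number followed by a data-name at the start of the line.
DATA_DEF = re.compile(r"^\s*(\d{2})\s+([A-Z0-9][A-Z0-9-]*)", re.IGNORECASE)

def data_names_by_member(members):
    """Map each data-name to the set of members that define it.

    `members` maps a member name to its list of data definition lines.
    """
    where = defaultdict(set)
    for member, lines in members.items():
        for line in lines:
            m = DATA_DEF.match(line)
            if m and m.group(2).upper() != "FILLER":
                where[m.group(2).upper()].add(member)
    return where

def duplicates(where):
    """Data-names defined in more than one member."""
    return {name: sorted(ms) for name, ms in where.items() if len(ms) > 1}

def containing(where, text):
    """Data-names that contain the given string."""
    return sorted(name for name in where if text.upper() in name)

# Hypothetical members for illustration.
members = {
    "PROGA": ["01  WS-DATE-YYMMDD  PIC 9(6).", "01  WS-TOTAL  PIC 9(7)V99."],
    "PROGB": ["01  WS-DATE-YYMMDD  PIC 9(6).", "05  FILLER  PIC X(4).",
              "05  RPT-YYMM  PIC 9(4)."],
}
where = data_names_by_member(members)
print(duplicates(where))
print(containing(where, "YYMM"))
```

The sketch compares names only. Deciding whether two same-named items are true duplicates still means comparing their definitions.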
Paragraph Information ( 7 )

Storing Functions in Paragraphs

All programs contain control statements that dictate the program's structure, and paragraphs that contain business and technical processes. Well-structured programs clearly separate the program's structural elements from the actual processing routines.

There are important long term benefits associated with storing business and technical functions in one or more contiguous paragraphs:

1. Isolating functions in paragraphs and separating the functions from the program's control statements makes it easier to understand program structure, and to identify specific business and technical functions.

2. Functions located in paragraphs are easily incorporated into other programs by simply moving the paragraphs to a copybook, or building a single function program around them.

Procedure Copybook Paragraph Usage

Procedure copybooks provide a simple way to incorporate a function into several programs. However, when a procedure copybook is large and contains several functions, programs which utilize the copybook do not always use all the functions in the copybook. This results in programs that are larger and more complex than necessary.

Large copybooks are good candidates for conversion into single function programs. This copybook-to-program conversion results in smaller, easier to maintain programs.

Analyzing the paragraphs used by each program is greatly simplified by utilizing the References to Paragraph-Name report, which lists all references to each paragraph in the procedure copybook. Report options provide for listing the entire control verb (e.g. PERFORM with VARYING and UNTIL parameters) for each reference, or for producing a compressed report that lists only the program and line number of each reference.
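The difference between the full and compressed listings can be pictured with a small Python sketch. It is only an approximation under our own assumptions: the function name paragraph_references and the sample members are hypothetical, only PERFORM and GO TO are treated as control verbs, and each control statement is assumed to fit on one source line.

```python
import re

def paragraph_references(members, paragraph, compressed=False):
    """List PERFORM / GO TO references to `paragraph`.

    `members` maps a member name to its list of source lines.
    Full entries are (member, line number, statement text);
    compressed entries are (member, line number) only.
    """
    pattern = re.compile(
        r"\b(?:PERFORM|GO\s+TO)\s+" + re.escape(paragraph) + r"(?![\w-])",
        re.IGNORECASE)
    refs = []
    for member in sorted(members):
        for number, line in enumerate(members[member], start=1):
            if pattern.search(line):
                if compressed:
                    refs.append((member, number))
                else:
                    refs.append((member, number, line.strip()))
    return refs

# Hypothetical members for illustration.
members = {
    "PAYROLL": [
        "    PERFORM CALC-TAX VARYING IDX FROM 1 BY 1 UNTIL IDX > 12.",
        "    PERFORM CALC-TAX-RATE.",
    ],
    "BILLING": ["    GO TO CALC-TAX."],
}
for ref in paragraph_references(members, "CALC-TAX"):
    print(ref)
print(paragraph_references(members, "CALC-TAX", compressed=True))
```

The reference to CALC-TAX-RATE is deliberately not reported for CALC-TAX, since a paragraph-name may legitimately be a prefix of another.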
Once you have identified where each paragraph is referenced, you can split the copybook into smaller copybooks better suited to current functional requirements, or you can move functions into single function programs.

Paragraph Information ( 8 )

Data Items Referenced in a Paragraph

When you move one or more paragraphs to a copybook or single function program (SFP), you must know what data items are referenced by the paragraphs. This is especially important when moving paragraphs to an SFP because you must also decide which data items will be included in the LINKAGE SECTION of the SFP; ideally, they should be located in one or more copybooks.

The task of determining which data items are referenced by selected paragraphs is greatly simplified by the Data Referenced Within Paragraph report, which lists the data definitions referenced within one or more selected paragraphs.

Similar Paragraph-names

Similar paragraph-names often occur when a programmer copies paragraphs from one program into another, resulting in similar or duplicate paragraph-names and procedure code. This becomes a problem when one paragraph is changed and the same change must also be applied to an unknown number of similar paragraphs located in other programs.

The recommended solution is to move one of the paragraphs to a copybook, and replace each of the redundant paragraphs with the new copybook. This procedure produces processing routines which are more consistent and maintainable.

Similar paragraphs can be located on the All Paragraph-Names screen, or the All Paragraph-Names report which can also match paragraph-names against a specified pattern or string (e.g. paragraph-names that contain the string SET-DIRECTORY). Use the =DUPS filter option to produce a list of all duplicate paragraph-names.

Similar paragraphs are also identified on the References to Data-Name and References to All Data-Names in Member reports because similar paragraphs usually contain references to the same data items.
This results in the paragraphs with similar names and data references being listed on the same report.

Member Information ( 9 )

Member Descriptions

Member descriptions are a simple but powerful facility for documenting source members (programs and copybooks) in a system. A single comment line describing the purpose of the program or copybook is placed in each source member.

As more single purpose programs are created, this facility becomes increasingly important for locating the program that accomplishes a specific business or technical task. Member descriptions can be thought of as an inventory of business and technical functions, record definitions, and linkage areas.

The All Members report prints a list of all source members, their contents (i.e. program, data copybook, or procedure copybook), and the source member description. The All Members screen provides the facility to view this list interactively.

Possible Member Description categories include:

   calc - Calculation routines
   file - File definitions
   conv - Data conversion routines
   tech - Technical routines

Example of All Members Report:

   Member     Contents    Description
   CMAT01     Proc Copy   CALC: Calculate maturity date
   RMASTER    Data Copy   FILE: Client Master file (length=1466)
   JULGREG    Proc Copy   CONV: Convert julian to gregorian date
   PCIOMAST   Program     TECH: Client master file access routines (PC)

Program Usage

One of the fundamental goals of software management is to isolate business and technical functions into single function programs (SFP), which makes it increasingly important to know where each program (or function) is used.

The Call Cross-Reference report lists where each program is referenced within the system. One important feature of this report is that by listing all USING parameters, you also know which data areas are required to CALL the program.
For example:

   CALL 'PROGRAM' USING LINK-AREA-1 LINK-AREA-2 LINK-AREA-3

Member Information ( 10 )

Copybook Usage

Data and procedure copybooks are an integral part of quality software. They eliminate redundant data areas, standardize file definitions, and simplify the utilization of existing business and technical routines. Potential problems associated with tracking large numbers of copybooks are avoided by using two reports:

1. The Copy Cross-Reference report lists where each copybook is referenced. When a copybook is changed, simply refer to this report for the list of programs that must be recompiled to pick up the updated copybook. This same information is also available on the References to Member screen.

2. The All Members report lists all programs and copybooks in a system, along with a description of what the program or copybook is used for. When you are trying to locate a single function program, procedure copybook, or file definition, this report can be searched or sorted to find the appropriate source member. This information is also available on the All Members screen.

System Analysis by Data Type

Data copybooks provide valuable information about the role a specific group of data items, such as a client master file definition, plays in a system. We can obtain a manageable view of the system, as it relates to a selected group of data items, by ignoring everything in the system except the source code that has a direct impact on the selected group.

Analysis by data copybook is provided by the References to Data-Names in Member report, which is available in two formats:

1. The single option lists the procedure references to each data item in the copybook. This shows how each individual data item is used, and also shows if it is not referenced.

2. The group option lists the references to all data items together. This provides a more compact report with a single view of how the entire copybook is used.
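The two formats can be illustrated with a short Python sketch. It is a simplification under our own assumptions rather than a description of GPSA: the function names, the sample copybook, and the sample programs are hypothetical, and a data-name that happens to appear inside a literal or comment would be counted as a reference.

```python
import re

# A two-digit level number followed by a data-name at the start of the line.
DATA_DEF = re.compile(r"^\s*\d{2}\s+([A-Z0-9][A-Z0-9-]*)", re.IGNORECASE)
# A COBOL word: letters, digits, and embedded hyphens.
WORD = re.compile(r"[A-Z0-9][A-Z0-9-]*", re.IGNORECASE)

def copybook_data_names(copybook_lines):
    """Data-names defined in a data copybook, in definition order."""
    names = []
    for line in copybook_lines:
        m = DATA_DEF.match(line)
        if m and m.group(1).upper() != "FILLER":
            names.append(m.group(1).upper())
    return names

def references_in_member(names, programs, group=False):
    """References to the copybook's data items in procedure code.

    single (default): {data-name: [(program, line number), ...]}
    group:            [(program, line number), ...] for any of the items
    """
    wanted = set(names)
    single = {name: [] for name in names}
    grouped = []
    for program in sorted(programs):
        for number, line in enumerate(programs[program], start=1):
            hits = wanted & {w.upper() for w in WORD.findall(line)}
            if hits:
                grouped.append((program, number))
                for name in hits:
                    single[name].append((program, number))
    return grouped if group else single

# Hypothetical copybook and programs for illustration.
copybook = [
    "01  CLIENT-REC.",
    "    05  CL-NAME     PIC X(30).",
    "    05  CL-BALANCE  PIC S9(7)V99.",
    "    05  CL-STATUS   PIC X.",
    "    05  FILLER      PIC X(10).",
]
programs = {
    "RPT1012": ["MOVE CL-NAME TO RPT-NAME.", "DISPLAY 'DONE'."],
    "UPDT01": ["ADD PAY-AMT TO CL-BALANCE.", "WRITE CLIENT-REC."],
}
names = copybook_data_names(copybook)
print(references_in_member(names, programs))
print(references_in_member(names, programs, group=True))
```

In the single format, CL-STATUS appears with an empty list, which is how an unreferenced data item would stand out.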
Member Information ( 11 )

Data Items Referenced Within Member

When you want to utilize an existing procedure copybook in a program, you want to know the data areas the procedure copybook references. This is provided by the Data Referenced Within Member report, which lists all data definitions referenced within a procedure copybook.

The Data Referenced Within Member report is also valuable as a quality assurance tool for analyzing procedure copybooks. Procedure copybooks usually reference data items that are defined within each program that utilizes the copybook. This report can be used to check that common data items have identical data definitions in each program. If practical, duplicate data items should be replaced by a data copybook based on the data items.

Storing Functions in Members

Moving business and technical functions from programs to copybooks and single function programs (SFP) is an important maintenance function which has a powerful impact on the long term viability of an application system.

Companies that have not taken advantage of this procedure are now discovering how inefficient and error prone it is to update business processes duplicated in several source members. Finding and understanding each slightly different version of a process is extremely time-consuming. Compare this with the small effort required to update a business function located in an SFP. Single function programs and copybooks increase reliability and save money!

Another problem with having duplicate functions is that each version typically has slight variations from the others. How do you determine which version is the correct one? The original coders are long gone, and as modifications have been applied to these original routines, they have become increasingly unstable and likely to produce erroneous results.
When moving functions to procedure copybooks and single function programs (SFP), consider that:

- A procedure copybook is faster when a function is used in fewer than four programs.

- An SFP requires more time and analysis for the initial setup, but is the preferred solution when a function is used in three or more programs.

Additional Topics ( 12 )

Linking External to Internal System Elements

Applications software is a mysterious black box with screens, reports, and data files providing its only connection to the outside (real?) world. Business oriented analysis begins with an external system element because that is the system to the business analyst and system user. Data goes in through this screen today...something happens...and the results are listed on a report tomorrow morning.

Screens: With GPSA, we can display BMS maps as they are displayed in a CICS region. An internal-external link is provided by connecting the map fields on the screen to the associated application data items. Beginning with an external screen field we can proceed to track the associated internal data item throughout a system using the Data References screen or References to Data-Name report. In practice, this provides a method of identifying application data items by their description and position on a screen which is familiar to system users and business analysts, instead of their internal names.

The BMS Map Reference report prints the BMS map definition, as displayed by CICS, along with each screen field, and the internal data items attached to the screen fields. Screens should also display a screen identifier to provide a link to the internal elements associated with the screen (e.g. the program managing the screen).

Files: When we begin with a file, we can use the References to All Data-Names in Member report to track the data throughout the application.
The file descriptions listed on the All Members report also connect the external file description with the source member containing the internal file definition.

Reports: Reports should have a report identifier (e.g. R1012) to provide a link to internal elements that are connected to the report. For example, the report heading and detail lines could be called R1012-HEADING and R1012-DETAIL, and would be easily located on the All Data-Names screen and report. The member descriptions listed on the All Members screen and report also provide a link between reporting programs and their reports. For example:

All Members Report:

   Member     Contents    Description
   RPT1002    Program     Weekly Client Delinquency Report (R1002)
   RPT1012    Program     Monthly Client Delinquency Report (R1012)

Additional Topics ( 13 )

Locate Selected Verbs

This somewhat unusual way of analyzing parts of a system is useful when migrating to another hardware platform or changing compilers.

Migrating an application from one hardware platform to another usually requires changes to the screen and file access methods. The Print Selected Verbs report has two report options that specifically support platform migration. They are: =IO, which lists all file access verbs, and =CICS, which lists all CICS commands. Applications that are designed to run on multiple platforms should have all screen and file access commands located in single function programs.

Changing or upgrading COBOL compilers can cause problems with compiler specific COBOL extensions and obsolete COBOL constructs. The ability to locate these COBOL elements provides valuable input when assessing the impact of changing compilers.

The Print Selected Verbs report lists source code containing selected COBOL and CICS commands. This report is also useful for locating errant DISPLAY, EXHIBIT, and ACCEPT commands in production systems.

An interesting feature of the Print Selected Verbs report is that it can be used to locate data tables.
By selecting "subscript", the report lists all subscripted data references. Two important reasons for looking at data tables are:

1. The problems inherent in maintaining duplicate data tables in multiple locations. Duplicate data tables should be replaced with a single data table located in a copybook or SFP.

2. Invalid data problems due to a subscript value being less than or greater than the number of table entries. Most of these errors are not detectable when the error occurs. Ideally, subscripts should be tested to ensure that their value is within the correct range for the table being accessed.

The Print Selected Verbs report, used in combination with the References to Data-Name report, can quickly identify all tables and references to their subscripts to determine if each subscript has adequate range checking.

Alphanumeric Literals in the Procedure Division

Coding alphanumeric literals in the PROCEDURE DIVISION is an unnecessary programming practice that lends itself to inconsistent, poor quality code. Alphanumeric literals should be replaced with data items whenever practical.

Some alphanumeric literals are more dangerous than others. For example, they should never be moved into data items that are used by more than one program.

The Hard-Coded Literals report lists all the alphanumeric literals in a system for easy review.

Please send your observations, comments and suggestions to:

   Director of Support Services
   COBOL Maintenance Technologies
   P.O. Box 122069
   Chula Vista, CA 91912

With GPSA, software maintenance is more than "just changing programs"; it's "managing your business and technical functions".

This release of the General Purpose System Analyzer (GPSA) produces a total of 26 analysis and documentation reports. If we missed one that you need, please let us know. We will also work with you to produce solutions for specific software maintenance and re-engineering opportunities.